
48 research outputs found

N-TERMINAL PROCESSING OF RIBOSOMAL PROTEIN L27 IN STAPHYLOCOCCUS AUREUS

Author: Caufield J. Harry
Publication venue: VCU Scholars Compass
Publication date: 07/05/2012

The bacterial ribosome is essential to cell growth, yet little is known about how its proteins attain their mature structures. Recent studies indicate that certain Staphylococcus aureus bacteriophage protein sequences contain specific sites that may be cleaved by a non-bacteriophage enzyme (Poliakov et al. 2008). The phage cleavage site was found to bear sequence similarity to the N-terminus of S. aureus ribosomal protein L27. Previous studies in E. coli (Wower et al. 1998; Maguire et al. 2005) found that L27 is situated adjacent to the ribosomal peptidyl transferase site, where it likely aids in new peptide formation. The predicted S. aureus L27 protein contains an additional N-terminal sequence not observed within the N-terminus of the otherwise similar E. coli L27; this sequence appears to be cleaved, indicating yet-unobserved ribosomal protein post-translational processing and use of host processes by phage. Phylogenetic analysis shows that L27 processing has the potential to be highly conserved. Further study of this phenomenon may aid antibiotic development.

VCU Scholars Compass

Interactomics-Based Functional Analysis: Using Interaction Conservation To Probe Bacterial Protein Functions

Author: Caufield J. Harry
Publication venue: VCU Scholars Compass
Publication date: 01/01/2016

The emergence of genomics as a discrete field of biology has changed humanity’s understanding of our relationship with bacteria. Sequencing the genome of each newly-discovered bacterial species can reveal novel gene sequences, though the genome may contain genes coding for hundreds or thousands of proteins of unknown function (PUFs). In some cases, these coding sequences appear to be conserved across nearly all bacteria. Exploring the functional roles of these cases ideally requires an integrative, cross-species approach involving not only gene sequences but knowledge of interactions among their products. Protein interactions, studied at genome scale, extend genomics into the field of interactomics. I have employed novel computational methods to provide context for bacterial PUFs and to leverage the rich genomic, proteomic, and interactomic data available for hundreds of bacterial species. The methods employed in this study began with sets of protein complexes. I initially hypothesized that, if protein interactions reveal protein functions and interactions are frequently conserved through protein complexes, then conserved protein functions should be revealed through the extent of conservation of protein complexes and their components. The subsequent analyses revealed how partial protein complex conservation may, unexpectedly, be the rule rather than the exception. Next, I expanded the analysis by combining sets of thousands of experimental protein-protein interactions. Progressing beyond the scope of protein complexes into interactions across full proteomes revealed novel evolutionary consistencies across bacteria but also exposed deficiencies among interactomics-based approaches. I have concluded this study with an expansion beyond bacterial protein interactions and into those involving bacteriophage-encoded proteins. This work concerns emergent evolutionary properties among bacterial proteins. 
It is primarily intended to serve as a resource for microbiologists but is relevant to any research into evolutionary biology. As microbiomes and their occupants become increasingly critical to human health, similar approaches may become increasingly necessary.

VCU Scholars Compass

Protein Complexes in Bacteria

Author: Abreu Marco
Caufield J. Harry
Uetz Peter
Wimble Christopher
Publication venue: VCU Scholars Compass
Publication date: 01/01/2015

Large-scale analyses of protein complexes have recently become available for Escherichia coli and Mycoplasma pneumoniae, yielding 443 and 116 heteromultimeric soluble protein complexes, respectively. We have coupled the results of these mass spectrometry-characterized protein complexes with the 285 “gold standard” protein complexes identified by EcoCyc. A comparison with databases of gene orthology, conservation, and essentiality identified proteins conserved or lost in complexes of other species. For instance, of 285 “gold standard” protein complexes in E. coli, fewer than 10% are fully conserved among a set of 7 distantly-related bacterial “model” species. Complex conservation follows one of three models: well-conserved complexes, complexes with a conserved core, and complexes with partial conservation but no conserved core. Expanding the comparison to 894 distinct bacterial genomes illustrates fractional conservation and the limits of co-conservation among components of protein complexes: just 14 out of 285 model protein complexes are perfectly conserved across 95% of the genomes used, yet we predict more than 180 may be partially conserved across at least half of the genomes. No clear relationship between gene essentiality and protein complex conservation is observed, as even poorly conserved complexes contain a significant number of essential proteins. Finally, we identify 183 complexes containing well-conserved components and uncharacterized proteins, which will be interesting targets for future experimental studies.
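
The fractional-conservation idea in this abstract can be illustrated with a minimal sketch. It assumes each genome is represented simply as the set of complex components for which an ortholog was found. The function names, the toy identifiers, and the rule used to separate the three conservation models are hypothetical simplifications and are not taken from the paper.

```python
from typing import Dict, Set


def fraction_conserved(components: Set[str], genome_orthologs: Set[str]) -> float:
    """Fraction of a complex's components that have an ortholog in one genome."""
    if not components:
        return 0.0
    return len(components & genome_orthologs) / len(components)


def classify_complex(components: Set[str], genomes: Dict[str, Set[str]]) -> str:
    """Assign one of three coarse conservation labels (illustrative rule only)."""
    if not genomes:
        raise ValueError("at least one genome is required")
    # Components of this complex that are present in each genome.
    present = [components & orthologs for orthologs in genomes.values()]
    if all(p == components for p in present):
        return "well-conserved"
    # Assumption: a "core" is any component found in every genome compared.
    core = set.intersection(*present)
    if core:
        return "conserved core"
    return "partial, no conserved core"


# Toy example with made-up component identifiers.
complex_abc = {"A", "B", "C"}
toy_genomes = {"g1": {"A", "B", "C", "X"}, "g2": {"A", "B"}, "g3": {"A"}}
label = classify_complex(complex_abc, toy_genomes)
```

A real analysis would likely use graded thresholds (for example, presence in 95% of genomes rather than all of them), which this sketch omits for brevity.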

Directory of Open Access Journals

PubMed Central

VCU Scholars Compass

FigShare

Bacterial protein meta-interactomes predict cross-species interactions and protein function

Author: Caufield J. Harry
Shary Semarjit
Uetz Peter
Wimble Christopher
Wuchty Stefan
Publication venue: VCU Scholars Compass
Publication date: 01/01/2017

Background: Protein-protein interactions (PPIs) can offer compelling evidence for protein function, especially when viewed in the context of proteome-wide interactomes. Bacteria have been popular subjects of interactome studies: more than six different bacterial species have been the subjects of comprehensive interactome studies while several more have had substantial segments of their proteomes screened for interactions. The protein interactomes of several bacterial species have been completed, including several from prominent human pathogens. The availability of interactome data has brought challenges, as these large data sets are difficult to compare across species, limiting their usefulness for broad studies of microbial genetics and evolution. Results: In this study, we use more than 52,000 unique protein-protein interactions (PPIs) across 349 different bacterial species and strains to determine their conservation across data sets and taxonomic groups. When proteins are collapsed into orthologous groups (OGs) the resulting meta-interactome still includes more than 43,000 interactions, about 14,000 of which involve proteins of unknown function. While conserved interactions provide support for protein function in their respective species data, we found only 429 PPIs (~1% of the available data) conserved in two or more species, rendering any cross-species interactome comparison immediately useful. The meta-interactome serves as a model for predicting interactions, protein functions, and even full interactome sizes for species with limited to no experimentally observed PPIs, including Bacillus subtilis and Salmonella enterica, which are predicted to have up to 18,000 and 31,000 PPIs, respectively.
Conclusions: In the course of this work, we have assembled cross-species interactome comparisons that will allow interactomics researchers to anticipate the structures of yet-unexplored microbial interactomes and to focus on well-conserved yet uncharacterized interactors for further study. Such conserved interactions should provide evidence for important but yet-uncharacterized aspects of bacterial physiology and may provide targets for anti-microbial therapies.
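
The collapsing step described in this abstract, in which species-level PPIs are mapped onto orthologous groups to form a meta-interactome, might be sketched as follows. This is an illustration under stated assumptions only: the input layout, the function names, and the choice to skip proteins without an OG assignment are hypothetical and do not reflect the authors' actual pipeline.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, Iterable, Set, Tuple

# Assumed input layout: (species, protein_a, protein_b)
Interaction = Tuple[str, str, str]


def build_meta_interactome(
    ppis: Iterable[Interaction], protein_to_og: Dict[str, str]
) -> Dict[FrozenSet[str], Set[str]]:
    """Map each OG-level interaction to the set of species in which it was seen."""
    og_pairs: Dict[FrozenSet[str], Set[str]] = defaultdict(set)
    for species, prot_a, prot_b in ppis:
        og_a = protein_to_og.get(prot_a)
        og_b = protein_to_og.get(prot_b)
        if og_a is None or og_b is None:
            continue  # assumption: proteins without an OG assignment are skipped
        # frozenset makes the pair unordered, so A-B and B-A collapse together.
        og_pairs[frozenset((og_a, og_b))].add(species)
    return dict(og_pairs)


def conserved_interactions(
    meta: Dict[FrozenSet[str], Set[str]], min_species: int = 2
) -> Set[FrozenSet[str]]:
    """OG-level interactions observed in at least `min_species` species."""
    return {pair for pair, species in meta.items() if len(species) >= min_species}


# Toy data with made-up protein and OG identifiers.
toy_ppis = [
    ("ecoli", "e1", "e2"),
    ("bsub", "b2", "b1"),
    ("ecoli", "e1", "e3"),
    ("bsub", "b9", "b1"),
]
toy_ogs = {"e1": "OG1", "e2": "OG2", "b1": "OG1", "b2": "OG2", "e3": "OG3"}
toy_meta = build_meta_interactome(toy_ppis, toy_ogs)
toy_conserved = conserved_interactions(toy_meta)
```

Keying on an unordered pair of OGs is one simple way to treat the same interaction reported in either orientation as a single edge.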

VCU Scholars Compass

Clinical Temporal Relation Extraction with Probabilistic Soft Logic Regularization and Global Inference

Author: Caufield J. Harry
Chang Kai-Wei
Han Rujun
Ping Peipei
Sun Yizhou
Wang Wei
Yan Yu
Zhou Yichao
Publication date: 16/12/2020

There has been a steady need in the medical community to precisely extract the temporal relations between clinical events. In particular, temporal information can facilitate a variety of downstream applications such as case report retrieval and medical question answering. Existing methods either require expensive feature engineering or are incapable of modeling the global relational dependencies among the events. In this paper, we propose a novel method, Clinical Temporal ReLation Extraction with Probabilistic Soft Logic Regularization and Global Inference (CTRL-PG), to tackle the problem at the document level. Extensive experiments on two benchmark datasets, I2B2-2012 and TB-Dense, demonstrate that CTRL-PG significantly outperforms baseline methods for temporal relation extraction. Comment: 10 pages, 4 figures, 7 tables, accepted by AAAI 202

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

KG-Hub-building and exchanging biological knowledge graphs.

Author: Balhoff Jim
Bruskiewich Richard M
Callahan Tiffany J
Cappelletti Luca
Carbon Seth
Caufield J Harry
Chan Lauren E
Cortes Katherina
Elsarboukh Glass
Fontana Tommaso
Haendel Melissa A
Harris Nomi L
Hegde Harshad
Joachimiak Marcin P
Matentzoglu Nicolas
Moxon Sierra A T
Mungall Christopher J
Munoz-Torres Monica C
Putman Tim
Ravanmehr Vida
Reese Justin T
Robinson Peter N
Schaper Kevin
Shefchek Kent A
Thessen Anne E
Unni Deepak R
Publication venue: The Mouseion at the JAXlibrary
Publication date: 01/07/2023

MOTIVATION: Knowledge graphs (KGs) are a powerful approach for integrating heterogeneous data and making inferences in biology and many other domains, but a coherent solution for constructing, exchanging, and facilitating the downstream use of KGs is lacking. RESULTS: Here we present KG-Hub, a platform that enables standardized construction, exchange, and reuse of KGs. Features include a simple, modular extract-transform-load pattern for producing graphs compliant with Biolink Model (a high-level data model for standardizing biological data), easy integration of any OBO (Open Biological and Biomedical Ontologies) ontology, cached downloads of upstream data sources, versioned and automatically updated builds with stable URLs, web-browsable storage of KG artifacts on cloud infrastructure, and easy reuse of transformed subgraphs across projects. Current KG-Hub projects span use cases including COVID-19 research, drug repurposing, microbial-environmental interactions, and rare disease research. KG-Hub is equipped with tooling to easily analyze and manipulate KGs. KG-Hub is also tightly integrated with graph machine learning (ML) tools which allow automated graph ML, including node embeddings and training of models for link prediction and node classification. AVAILABILITY AND IMPLEMENTATION: https://kghub.org

The Jackson Laboratory: The Mouseion at the JAXlibrary

The Monarch Initiative in 2024: an analytic platform integrating phenotypes, genes and diseases across species.

Author: Alquaddoomi Faisal S
Braun Ian
Bruskiewich Richard M
Cappelletti Luca
Carbon Seth
Caron Anita R
Caufield J Harry
Chan Lauren E
Chute Christopher G
Cortes Katherina G
Cox Corey
De Souza Vinícius
Elsarboukh Glass
Fontana Tommaso
Gehrke Sarah
Haendel Melissa A
Harris Nomi L
Hartley Emily L
Hegde Harshad
Hurwitz Eric
Jacobsen Julius O B
Krishnamurthy Madan
Laraway Bryan J
Matentzoglu Nicolas
McLaughlin James A
McMurry Julie A
Moxon Sierra A T
Mullen Kathleen R
Mungall Christopher J
Munoz-Torres Monica C
O'Neil Shawn T
Osumi-Sutherland David
Putman Tim E
Reese Justin T
Robinson Peter N
Rubinetti Vincent P
Schaper Kevin
Shefchek Kent A
Smedley Damian
Stefancsik Ray
Toro Sabrina
Vasilevsky Nicole A
Walls Ramona L
Whetzel Patricia L
Publication venue: The Mouseion at the JAXlibrary
Publication date: 05/01/2024

Bridging the gap between genetic variations, environmental determinants, and phenotypic outcomes is critical for supporting clinical diagnosis and understanding mechanisms of diseases. It requires integrating open data at a global scale. The Monarch Initiative advances these goals by developing open ontologies, semantic data models, and knowledge graphs for translational research. The Monarch App is an integrated platform combining data about genes, phenotypes, and diseases across species. Monarch's APIs enable access to carefully curated datasets and advanced analysis tools that support the understanding and diagnosis of disease for diverse applications such as variant prioritization, deep phenotyping, and patient profile-matching. We have migrated our system into a scalable, cloud-based infrastructure; simplified Monarch's data ingestion and knowledge graph integration systems; enhanced data mapping and integration standards; and developed a new user interface with novel search and graph navigation features. Furthermore, we advanced Monarch's analytic tools by developing a customized plugin for OpenAI's ChatGPT to increase the reliability of its responses about phenotypic data, allowing us to interrogate the knowledge in the Monarch graph using state-of-the-art Large Language Models. The resources of the Monarch Initiative can be found at monarchinitiative.org and its corresponding code repository at github.com/monarch-initiative/monarch-app.

The Jackson Laboratory: The Mouseion at the JAXlibrary

The Human Phenotype Ontology in 2024: phenotypes around the world.

Author: Addo-Lartey Eunice B
Anagnostopoulos Anna V
Anderton Joel
Avillach Paul
Bagley Anita M
Bakštein Eduard
Balhoff James P
Baynam Gareth
Bello Susan M
Berk Michael
Bertram Holli
Bishop Somer
Blau Hannah
Bodenstein David F
Botas Pablo
Boztug Kaan
Callahan Tiffany J
Cameron Rhiannon
Carbon Seth J
Carmody Leigh
Castellanos Francisco
Caufield J Harry
Chan Lauren E
Chute Christopher G
Coleman Ben D
Cruz-Rojo Jaime
Dahan-Oliel Noémi
Danis Daniel
Davids Jon R
de Dieuleveult Maud
de Souza Vinicius
de Vries Bert B A
de Vries Esther
DePaulo J Raymond
Derfalvi Beata
Dhombres Ferdinand
Diaz-Byrd Claudia
Dingemans Alexander J M
Donadille Bruno
Duyzend Michael
Elfeky Reem
Essaid Shahim
Fabrizzi Carolina
Fico Giovanna
Firth Helen V
Freudenberg-Hua Yun
Fullerton Janice M
Gabriel Davera L
Gargano Michael
Gilmour Kimberly
Giordano Jessica
Goes Fernando S
Green Ian
Griese Matthias
Groza Tudor
Gu Weihong
Guthrie Julia
Gyori Benjamin
Haendel Melissa A
Hamosh Ada
Hanauer Marc
Hanušová Kateřina
Harris Nomi L
He Yongqun Oliver
Hegde Harshad
Helbig Ingo
Holasová Kateřina
Hoyt Charles Tapley
Huang Shangzhi
Hurwitz Eric
Jacobsen Julius O B
Jiang Xiaofeng
Joseph Lisa
Keramatian Kamyar
King Bryan
Knoflach Katrin
Koolen David A
Kraus Megan L
Kroll Carlo
Kusters Maaike
Köhler Sebastian
Ladewig Markus S
Lagorce David
Lai Meng-Chuan
Lapunzina Pablo
Laraway Bryan
Lewis-Smith David
Li Xiarong
Lucano Caterina
Majd Marzieh
Marazita Mary L
Martinez-Glez Victor
Matentzoglu Nicolas
McHenry Toby H
McInnis Melvin G
McMurry Julie A
Mihulová Michaela
Millett Caitlin E
Mitchell Philip B
Moses Rachel Gore
Moslerová Veronika
Mungall Christopher J
Munoz-Torres Monica C
Narutomi Kenji
Nematollahi Shahrzad
Nevado Julian
Nierenberg Andrew A
Nurnberger John I
Ogishima Soichi
Olson Daniel
Ortiz Abigail
Pachajoa Harry
Perez de Nanclares Guiomar
Peters Amy
Putman Tim
Rapp Christina K
Rath Ana
Reese Justin
Rekerle Lauren
Roberts Angharad M
Robinson Peter N
Roy Suzy
Sanders Stephan J
Schuetz Catharina
Schulte Eva C
Schulze Thomas G
Schwarz Martin
Scott Katie
Seelow Dominik
Seitz Berthold
Shen Yiping
Similuk Morgan N
Simon Eric S
Singh Balwinder
Smedley Damian
Smith Cynthia
Smolinsky Jake T
Sperry Sarah
Stafford Elizabeth
Stefancsik Ray
Steinhaus Robin
Strawbridge Rebecca
Sundaramurthi Jagadish Chandrabose
Talapova Polina
Tenorio Castano Jair A
Tesner Pavel
Thomas Rhys H
Thurm Audrey
Toro Sabrina
Turnovec Marek
van Gijn Marielle E
Vasilevsky Nicole A
Vlčková Markéta
Walden Anita
Wang Kai
Wapner Ron
Ware James S
Wiafe Addo A
Wiafe Samuel A
Wiggins Lisa D
Williams Andrew E
Wu Chen
Wyrwoll Margot J
Xiong Hui
Yalin Nefize
Yamamoto Yasunori
Yatham Lakshmi N
Yocum Anastasia K
Young Allan H
Yüksel Zafer
Zandi Peter P
Zankl Andreas
Zarante Ignacio
Zvolský Miroslav
Čady Jolana
Čajbiková Nikola Novák
Publication venue: The Mouseion at the JAXlibrary
Publication date: 05/01/2024

The Human Phenotype Ontology (HPO) is a widely used resource that comprehensively organizes and defines the phenotypic features of human disease, enabling computational inference and supporting genomic and phenotypic analyses through semantic similarity and machine learning algorithms. The HPO has widespread applications in clinical diagnostics and translational research, including genomic diagnostics, gene-disease discovery, and cohort analytics. In recent years, groups around the world have developed translations of the HPO from English to other languages, and the HPO browser has been internationalized, allowing users to view HPO term labels and, in many cases, synonyms and definitions in ten languages in addition to English. Since our last report, a total of 2239 new HPO terms and 49235 new HPO annotations were developed, many in collaboration with external groups in the fields of psychiatry, arthrogryposis, immunology and cardiology. The Medical Action Ontology (MAxO) is a new effort to model treatments and other measures taken for clinical management. Finally, the HPO consortium is contributing to efforts to integrate the HPO and the GA4GH Phenopacket Schema into electronic health records (EHRs) with the goal of more standardized and computable integration of rare disease data in EHRs.

The Jackson Laboratory: The Mouseion at the JAXlibrary